Abstract

In 1990, the Kentucky Education Reform Act (KERA) mandated a complete
restructuring of the public school system. Jefferson County Public Schools (JCPS) is implementing the
Class Size Reduction (CSR) Program in 34 elementary schools. The CSR program is a new 
federal initiative to help elementary schools improve student learning by hiring additional 
teachers. A comprehensive literature review framed this qualitative and quantitative 
investigation. The overall evaluation question that guided this study concerned the impact of the CSR
program on students, teachers, and principals in JCPS elementary schools. First, a participant-oriented
evaluation model was utilized for conducting the qualitative or process part of this
study. Qualitative data were collected using unstructured interviews, site observations, and 
document analysis. The qualitative data analysis was based on the grounded theory model. Basic 
themes that were found during the process of this research included the impact of CSR on 
students, teachers, and parental involvement. Second, a management-oriented evaluation model 
was utilized for conducting the quantitative or outcome part of this study. The continuous 
assessment system of the county and the state of Kentucky provided data collection and 
instruments. A quasi-experimental design using an aggregated matching procedure for the
comparison and treatment groups was employed (N = 102 students). Students were drawn only
from schools that (a) had similar socio-economic characteristics, (b) participated in the assessment
process at the beginning and at the end of the school year, and (c) had both conditions
(regular and reduced class size). Findings indicated that, after a one-year intervention in third
grade, the class size reduction program did not increase student learning as measured by the 
same standardized test in the subject areas of reading and mathematics. Implications for policy, 
practice, and further research are discussed.

Class Size Reduction Program

Introduction 

Jefferson County Public Schools (JCPS) is the 26th largest school district in the United 
States. The school district serves more than 96,000 students from preschool to grade 12. JCPS 
has a vision for long-term student achievement. The vision, entitled "Beyond 2000," was designed
to assure that every student will acquire the fundamental academic and life skills necessary for
success in the classroom and workplace. The JCPS vision commits the school system to educate each
student to the highest academic standards.

The stated goal of the Class Size Reduction (CSR) initiative is to help the schools of Jefferson
County improve student achievement. The assumption is that JCPS will benefit from adding
highly qualified teachers to the workforce to ensure that class size, particularly in the early
grades, is reduced to no more than 18 children per class. The target population of the program is
schools facing challenges in terms of student achievement as measured by the previous state
assessment system, the Kentucky Instructional Results Information System (KIRIS). The target
schools are 35 elementary schools in JCPS: 16 schools with a KIRIS index below 40 and 19 schools
continuing with KIRIS indexes above 40.

The CSR program is a new initiative to help schools improve student learning by hiring 
qualified teachers so that children in the early elementary grades can attend smaller classes. The 
assumption underlying the program is based on a growing body of research demonstrating that 
students attending small classes in the early grades make more rapid educational progress than 
students in larger classes, and that these achievement gains persist well after students move on to 
larger classes in later grades.

School districts are currently receiving federal funds ($1.2 billion) that will enable them
to recruit, hire, and train new teachers for the 1999-2000 school year. That is just the first
installment of an initiative that is anticipated to provide $12.4 billion over 7 years to help schools
hire 100,000 new teachers and reduce class size in the early grades to a nationwide average of
18. Schools are preparing to issue public “report cards” to inform parents and communities about 
progress in reducing class size and improving student achievement. 

Early implementation reports submitted by school districts and states show that
districts are hiring thousands of teachers with these funds. These teachers are being placed
primarily in grades one through three and class sizes are being reduced significantly as a result. 
Funds are targeted to high poverty communities, but most districts will receive awards. 

Children participating in the CSR program will receive more personal attention in
smaller classes. In addition, children will acquire a solid foundation for further learning. Finally, 
children will learn to read independently and well by the end of the third grade. 

Literature Review 

The objective of this literature review is to present current research on the topic of class-size
reduction. The general objective is to summarize major variables, operationalization of
variables, research designs, findings, and recommendations concerned with class-size reduction
projects. In the review of literature, two major studies were found: the Tennessee STAR
experiment and the Wisconsin SAGE quasi-experiment (Grissmer, 1999). The following
paragraphs describe both experiments in general terms.

The Tennessee STAR experiment was a multi-district study. The researchers randomly 
assigned a single cohort of kindergarten students in 79 participating schools to three treatment 
groups: (a) large classes without an aide (approximate mean of 22-24 students); (b) large classes
with an aide (approximate mean of 22-24 students); and, (c) small classes (approximate mean of 
15-16 students). Those students entering at kindergarten were scheduled to maintain their 
treatment through first, second, and third grade. However, the treatment groups changed in
significant ways after kindergarten due to attrition and late-entering students, which poses a
threat to internal validity.

The sample of participating students in any grade was over 6,000 students, but late entries 
and exits meant that about 12,000 students were included over the 4 years. There were 
approximately 2,000 students in small classes in the Tennessee study. The size of the control 
group was around 4,000 in the Tennessee experiment, if both regular and regular with aide 
classes are combined. 

The characteristics of the students were different from those of average Tennessee 
students. The experimental sample contained approximately 33% minority students, and 50% to
60% of all students were eligible for free or reduced-price lunch, compared to 23% minority
students and about 43% free or reduced-price lunch students for Tennessee in 1996. The sample
was also different from students nation-wide in the US, where approximately 30% were minority 
students and 37% were eligible for free and reduced-price lunch in 1990. 

The Wisconsin SAGE Quasi-Experiment also included only schools with very high 
proportions of free-lunch students. Assignments were not randomized within schools, but rather 
a pre-selected control group of students from different schools was matched as a group to the 
students in treatment schools. The treatment has been more accurately characterized as pupil- 
teacher ratio reduction since a significant number of schools chose two teachers in a large class 
rather than one teacher in a small class. The size of the reduction in pupil-teacher ratio was 
slightly larger than CSRs in Tennessee.

There were about 1,600 students in the small pupil-teacher treatment group in the
Wisconsin study. The size of the control group was around 1,300 students in the Wisconsin
quasi-experiment. The SAGE sample had approximately 50% minority students with almost 70%
eligible for free or reduced-price lunch. 

Tests were given at the beginning and end of the first grade, rather than at the end of 
consecutive years as in project STAR. Since achievement changes differently for advantaged and 
disadvantaged students over the summer, a beginning-year test for schooling effects is probably a 
better control than a previous end-year test. 

The current results from experimental and quasi-experimental studies show statistically 
significant effects from large CSRs in early grades in all subjects tested from kindergarten 
through eighth grade. However, the size of the effects is hard to pin down because it is dependent
on student characteristics, the length of time, the grade level, the tests or measurements, and 
whether the effects are short or long-term. These basic results are also estimated using models 
that control for school effects and teacher and classroom covariates. Finn and Achilles (1999) 
report that the beneficial effects for minority students are approximately double those for White
students in Grades K-3. Krueger (1999) also reports larger positive effects for minority students. 

Grissmer (1999) argues that possible bias due to deviations from ideal experimental 
conditions should be analyzed. Hanushek (1999) summarizes potential threats to internal 
validity, including lack of randomization of schools, differential attrition, Hawthorne effects, and 
contextual factors (e.g., types of students, teachers, grades included, and curriculum). For 
example, schools were not randomly selected. The selection of schools is important for 
generalizability.

The schools selected were not representative of Tennessee schools, and were even less
representative of students in the nation. In addition, teacher characteristics may be a likely factor 
influencing results. Some teachers may be more effective at utilizing the additional time small 
classes provide (Hanushek, 1999). Finally, the composition of the classes might be another 
important element. Krueger (1999) provides evidence that having more classmates who attended 
kindergarten and were not eligible for free lunch affected achievement in the Tennessee study. 

Grissmer (1999) argues that the Tennessee and Wisconsin data certainly support larger 
class size effects for minority and free-lunch students. However, neither sample contained large 
enough proportions of more advantaged White students to determine if and how rapidly effects 
might approach zero. Lee and Barro (1998) analyzed international data and found higher
achievement with smaller class size. However, it is more difficult to do international 
comparisons since the level of family support and time spent outside the classroom on education 
can vary considerably across cultures. 

As a result, it is clear that multiple sources of experimental "noise" exist when conducting
evaluation research on CSR programs. The class size effects might depend on the characteristics 
of the student population in the study. Class size effects may be large and significant only for 
minority and disadvantaged populations, and small or non-existent for more advantaged students. 
However, the Tennessee and Wisconsin experiments included mostly minority and disadvantaged
populations. Since students with these characteristics probably comprise less than one third of all 
students nationally, other measurements that include the full range of students may show much 
smaller effects (Grissmer, 1999). 

In addition, class size effects differ by other contextual variables. Teacher
characteristics (e.g., experience and education), grade, class size ranges, and type of school
(urban versus suburban) have an impact on the analysis. Another important element might be
the level of funding available per pupil. The net effect of the CSR program gets even more complicated if
non-school variables affecting learning are included in the analysis (Munoz, Clavijo, & Koven, 
1999). One method might be to compare the differences in covariates in large and small classes. 
However, the groups will still be non-equivalent groups. 

Finn and Achilles (1999) stress the finding that teachers report that students exhibit more
"on-task" behavior and engagement in learning, not only in small classes in second grade, but
after being returned to large classes in fourth grade in the STAR project. These on-task behaviors
may be due to more teacher attention, greater opportunity to participate, and other reasons. 

Molnar, Smith, Zahorik, Palmer, Halbach, and Ehrle (1999) argue that the Wisconsin 
project suggests that teachers in small classes spend less time on discipline and more time on 
instruction. Teachers report more individualized instruction and greater knowledge of each 
student's strengths and weaknesses. In addition, teachers reported more hands-on activities, more
small group discussion, and more content covered. Betts and Shkolnik (1999) argue that the class 
size reduction results indicate statistically significant effects on a number of time variables. 
Teachers in smaller classes have more instructional time because they spend less time on discipline
and administrative routines, and they shift toward more individualized instruction and less lecture time.

Grissmer (1999) argues that, despite the potential flaws in non-experimental data, policy 
analysis will largely be dependent on improving non-experimental analysis. Large-scale 
experiments such as the Tennessee STAR experiment can be costly and time consuming to plan, 
implement, and analyze. "While more experimentation seems essential to making progress in 
educational research, experiments can never be depended on to solve all the complex and 
contextual effects" (p. 239).

Finally, some elements concerning the costs of CSR should be mentioned.
According to Grissmer (1999), costs are significantly reduced by targeting
reductions to students in lower income families as measured by free-lunch participation. Since 
the results from experimental data consistently show larger short-term effects for minority and 
free lunch students, CSRs will be significantly more cost-effective when targeted to these 
students. For example, the program can have a better cost-benefit ratio if targeted at schools having at
least 50% of students eligible for free and reduced-price lunch. This is in accordance with previous
research on student achievement and its connection with student socio-economic status (SES) in 
the district under examination (Munoz & Dossett, 2001). 

In summary, a mixed-method design involving qualitative and quantitative research might
prove useful to understand the dynamics of the CSR program in a large urban district. At this
moment, few studies have involved the analysis of the CSR program from the perspective of key
stakeholders such as teachers and principals (CSR Research Consortium, 1999). In any
successful educational reform effort, it is clearly relevant to include what happens at the
classroom level. In addition, a quasi-experimental design might shed light on the impact of
the federal initiative in terms of student scores on the same standardized tests given at the 
beginning and at the end of the school year under study. 

For the first analysis, the purpose was to conduct a grounded-theory qualitative study on 
the CSR program in JCPS. The overarching question that guided this investigation was: what 
kind of insights might be obtained from stakeholders (i.e., teachers and principals) participating 
in the CSR program? Ancillary questions were concerned with the implementation strategies, the 
impact on students' academic and non-academic measures, the impact on parental involvement,
and obtaining feedback for improving the program. For the second analysis, the purpose was to
conduct a quasi-experimental quantitative study on the CSR program in JCPS. The overarching
evaluation research question was the following: What is the impact of CSR at the participating
elementary schools in terms of student educational achievement? Specifically, the objective was to
determine if there was a difference in achievement on post-test scores between a treatment group 
and a matched comparison group of students. 

Evaluation Models 

A Mixed Methodology Approach 

The qualitative and quantitative paradigms can both inform decision makers. Greene and
McClintock (1985) argue that mixed designs can be used for five distinct purposes: triangulation, 
complementarity, development, initiation, and expansion. The triangulation and complementarity 
purposes are the ones that could be used for evaluating the CSR program. Quantitative and 
qualitative methods can be used in combination, but more in a complementary fashion (i.e., not 
in a "mixing" fashion). In this evaluation study, it will not be a case of a "small q" approach (i.e., 
open-ended responses on a questionnaire as qualitative measures). In the CSR program 
evaluation, the differences in nature between quantitative and qualitative methodology will be 
established from the very beginning: 

Researchers who work deductively gather data to test specific hypotheses, not to generate 
new hypotheses, and serendipitous findings are considered interesting but unreliable. By 
contrast, researchers who work inductively continue to generate new hypotheses and look 
for new questions even as they gather data (Worthen et al., 1997, p. 396). 

The Participant-Oriented Evaluation Approach 

The participant-oriented evaluation approach (Worthen, Sanders, & Fitzpatrick, 1997) 
was utilized in the CSR program evaluation. The level of involvement of multiple stakeholders
distinguishes the participant-oriented approach to evaluation. In this study, many stakeholders
were involved during the evaluation process. Many sources of evidence and perspectives were 
analyzed, such as teachers and principals participating in the CSR program. The participant- 
oriented evaluation approach has an important characteristic: it is the most ardent in advocating 
the inclusion of many different perspectives. No one view of the program reflects truth and, thus, 
the evaluator must seek many different perspectives to understand the evaluation object in its 
totality (Guba & Lincoln, 1989; Patton, 1994; Stake, 1967; Stake, 1995). An evaluator who 
follows a participant-oriented evaluation approach typically uses: (a) inductive reasoning, (b) a 
multiplicity of data, (c) a plan that emerges during the evaluation, and (d) multiple rather than 
single realities. Stake (1995) argues that the central focus of participant-oriented evaluation is
the concerns and issues of the stakeholders. The ultimate test of the validity of an evaluation
study is the extent to which the evaluation increases the audience’s understanding of the entity
under evaluation. Crosschecking and triangulation are two
methods used by naturalistic evaluators to establish credibility of the findings (Guba & Lincoln,
1989). Under the participant-oriented evaluation, a sample of key persons involved in the
implementation of the CSR program was interviewed and observed. The evaluator conducted
informal discussions with different key persons or stakeholders, namely teachers and principals.

In the CSR program, the qualitative portion included examples that could characterize it as a
formative evaluation. The next part, the summative study, involved collecting and analyzing
quantitative data.

The Management-Oriented Evaluation Approach 

The management-oriented evaluation approach (Worthen, Sanders, & Fitzpatrick, 1997) 
was used in the outcome evaluation of the CSR program. Daniel Stufflebeam (1983; Stufflebeam
& Shinkfield, 1985) is one of the most recognized leaders of the management-oriented approach.
According to Stufflebeam, evaluation is a process of delineating, obtaining, and providing
useful information for judging decision alternatives. The Context, Input, Process, and Product
(CIPP) Evaluation has different objectives, methods, and relation to decision making in the 
change process depending on the type of evaluation emphasis. 

The management-oriented rationale is that the evaluative information is an essential part 
of good decision-making and that the evaluator can be most effective by serving 
administrators, policy makers, boards, practitioners, and others who need good evaluative 
information (Worthen et al., 1997, p. 97). 

Campbell's (1969) seminal article on reforms as experiments is germane to this evaluation.
Today, 30 years later, many ameliorative programs terminate with no interpretable evaluation. 
The good intentions of educational administrators are not enough. Establishing social indicators, 
data banks, and management information systems (MIS) is not enough. As Campbell (1969) 
argues, administrators are sometimes so committed in advance to the efficacy of the reform that
they cannot afford an honest evaluation. Capitalizing on regression, grateful testimonials, and
confounding selection and treatment are the major strategies to bias the analysis. 

The United States and other modern nations should be ready for an experimental 
approach to social reform, an approach in which we try out new programs designed to 
cure specific social problems, in which we learn whether or not these programs are
effective, and in which we retain, imitate, modify, or discard them on the basis of
apparent effectiveness on the multiple imperfect criteria available. (p. 409)




Experimental or quasi-experimental research designs are intended to establish cause-effect
relationships among variables. The reader should keep in mind that only researchers conducting "true"
experimental designs can provide the most convincing evidence about causation, that is, whether
a variable X has a causal effect on a variable Y. True experimental designs have random
assignment of participants into treatment and control groups (Winer, Brown, & Michels, 1991). 
The CSR program evaluation will be quasi-experimental (Campbell & Stanley, 1966). Quasi- 
experimental designs do have control or comparison groups, but do not have random assignment 
of participants. In the CSR program evaluation, as in many social or educational programs, 
randomness does not exist. The CSR quasi-experimental evaluation design has different levels of 
strength in terms of quality, i.e., internal and external validity (Campbell & Stanley, 1966). Cook 
and Campbell (1979) added two more classes of validity in addition to internal and external 
validity: (a) statistical conclusion validity (whether the study has appropriate statistical testing 
procedures and acceptable error probabilities) and (b) construct validity of causes and effects 
(whether the researcher has defined the treatment adequately). The evaluator focused on the 
cognitive domain (e.g., the effect on student achievement) and exercised as much control as 
possible over extraneous variables. In addition, an understanding of the theory of why the
treatment should work will help in interpreting the data (Lipsey, 1993). 

Method for the Qualitative Study 

Units of Analysis 

A participant-oriented approach to evaluation is similar to case study research. A case 
study is the preferred research strategy under certain conditions such as: (a) when investigators 
have little control over events and (b) when the focus is on a contemporary phenomenon in some 
real-life context (Yin, 1994). This case study fits these criteria. The study was conducted as the 
program was implemented in the 1999-2000 school year. In total, 40 key stakeholders from eight 
of the 34 schools participating in the program were interviewed. The sample was selected based
on the maximum variation sampling technique, which involves selecting cases that illustrate the
range of variation in the phenomena to be studied (Gall, Borg, & Gall, 1996). The criterion used
to select the sample of CSR elementary schools was location at the upper, average, or
lower end of the continuum of KIRIS Index scores. The schools in the sample included two
low-performing schools, four average-performing schools, and two high-performing schools.

The stakeholders participating in the interviews were (a) 32 teachers responsible for planning 
curriculum and delivering instruction and (b) eight principals involved with program 
development and administration. The researcher conducted non-structured classroom 
observations in all the participating schools for purposes of data triangulation and member 
checking (Gall, Borg, & Gall, 1996).

Instrumentation 

In general, the qualitative measures included document analysis (e.g., school plans),
semi-structured interviews, non-structured observations, and field notes (e.g., school climate). 
Overall, the evaluation was conducted using a comparative case study based on the grounded-theory
paradigm (Gall, Borg, & Gall, 1996; Glaser & Strauss, 1967; Strauss & Corbin, 1990;
Yin, 1994). The grounded-theory approach is guided by initial concepts, but shifts or discards them as the
data are collected and analyzed (Marshall & Rossman, 1989, p. 113). The great advantage of the
qualitative measures is that they allowed the evaluator to understand at a deeper level the context
in which the CSR program is operating. In addition, the evaluator had the opportunity to involve
various stakeholders. For instance, teacher perceptions of the program were captured by
interviewing them (n = 32). Principal perceptions of the program were captured in the same way,
through interviews (n = 8).

Data Collection and Analysis

To examine some dimensions of the CSR program, a qualitative approach was employed 
to understand the experiences of primary stakeholders, namely school principals and teachers. The
qualitative paradigm responds to the need to understand the social processes that define
the CSR program. In this sense, this study was grounded in the belief that truth and knowledge 
are created (Schwandt, 1994) and that understandings of the world are socially constructed 
(Gergen & Gergen, 1991). Data collection began in December 1999 and continued through 
March 2000. Data were gathered through multiple methods, including interviews, observations, and
documents. The primary source of data consisted of in-depth, semi-structured interviews with the 
40 key stakeholders (e.g., what are your perceptions about the advantages and disadvantages of 
the CSR program). A secondary source of data consisted of classroom observations in the 
selected schools (e.g., level of student engagement in learning as on-task behavior). Field notes 
were also made throughout the site visits during the evaluation of the program. Field notes 
documented such factors as the nature of student-teacher interactions and the types of concerns 
expressed by stakeholders such as the principals. 

Data collection and analysis occurred simultaneously and continued throughout the
study (Glaser & Strauss, 1967). The data analysis was based on the constant comparison method
(Glaser & Strauss, 1967). “The constant comparison method refers to the continual process of 
comparing segments within and across categories. Using constant comparison, the researcher 
clarifies the meaning of each category, creates distinctions between categories, and decides 
which categories are most important to the study” (Gall, Borg, & Gall, 1996, pp. 566-567).
Ongoing analysis influenced the scope and direction of succeeding observations, interviews and 
document collections. Triangulation of findings was achieved by the use of multiple data
collection methods, as well as by independent data analysis with other stakeholders involved in
the evaluation at JCPS (Bogdan & Biklen, 1998). Coding processes included identifying 
concepts embedded within the data, organizing discrete concepts into categories, defining the 
properties and dimensions of categories and linking them according to their properties and 
dimensions into broad, explanatory themes (Strauss, 1987; Strauss & Corbin, 1990). Qualitative
research thrives on “thick descriptions” (Gall, Borg, & Gall, 1996).

Method for the Quantitative Study 

Participants 

Thirty-four elementary schools in JCPS are currently participating in the CSR program. 
The characteristics of the participating students are different from those of average JCPS 
elementary students. JCPS elementary schools have approximately 34.8% minority students and 
about 59.9% of all students eligible for free or reduced-price lunch. The CSR program schools 
contain approximately 40% minority students and about 72% of all students eligible for free or 
reduced-price lunch. 

The participating schools were first examined to analyze the number of students per class 
in grade 3. For all the schools participating in the CSR program (N = 1,798 students), the 
evaluator tested the impact of the program by randomly creating two matched groups: (a) more 
than 18 students (i.e., comparison group); and, (b) less than 19 students (i.e., treatment group) 
that participated in the assessment process at the beginning and at the end of the school year. The 
goal was to discover the number of schools that could be analyzed using a “nested design” to 
overcome as many threats to internal validity as possible. Table 1 shows the schools and
testing scores of the students that had two types of classes in grade 3: (a) less than 19 students 
and (b) more than 18 students. It should be noted that not all the participating elementary schools
took the pre- and post-test instrument in the school year. Four schools met these conditions. In
terms of free/reduced lunch population, School A has 58%, School B has 58%, School C has 52%,
and School D has 62% for the 1999-2000 academic year. Thus, the comparability in this
important socio-economic indicator is evident across the purposefully selected schools
participating in this analysis. There was no effort to check comparability in terms of reading and 
writing across the four schools since the third graders do not receive the CATS assessment in the 
District under examination. 
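
The school-selection step described above can be illustrated with a short sketch. The Python fragment below is only an illustration under stated assumptions: the school labels, the class sizes, and the function names are hypothetical and are not taken from the JCPS data. The only value taken from the text is the 18-student threshold that separates the treatment condition (less than 19 students) from the comparison condition (more than 18 students).

```python
from collections import defaultdict

# Threshold taken from the text: reduced classes have no more than 18 students.
CSR_CLASS_SIZE_CAP = 18


def condition(class_size):
    """Label a third-grade class as treatment (reduced) or comparison (regular)."""
    return "treatment" if class_size <= CSR_CLASS_SIZE_CAP else "comparison"


def schools_with_both_conditions(classes):
    """Return schools that have at least one class in each condition.

    `classes` is an iterable of (school_label, class_size) pairs.
    Only such schools can be analyzed with a within-school ("nested") design.
    """
    seen = defaultdict(set)
    for school, size in classes:
        seen[school].add(condition(size))
    return sorted(s for s, conds in seen.items()
                  if conds == {"treatment", "comparison"})


# Hypothetical classes, for illustration only (not JCPS data).
example_classes = [
    ("S1", 17), ("S1", 24),   # both conditions present
    ("S2", 16), ("S2", 15),   # reduced classes only
    ("S3", 23),               # regular classes only
    ("S4", 18), ("S4", 19),   # both conditions present
]
eligible = schools_with_both_conditions(example_classes)
```

In an actual analysis, a second filter (participation in both the pre-test and the post-test) would be applied to the schools returned by this step.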

An aggregated matching procedure was used to guarantee that the two groups were 
equivalent. A total of eight participants were excluded from the analysis after ensuring that the 
groups were similar in the fundamental academic variables, namely reading and mathematics 
pre-test scores taken in the Fall of 1999 (i.e., from N = 1 10 to N = 102 students). 
Table 1

Students from the Schools Participating in the CSR Nested Design Study (N = 110)

School                              N      M      SD    t value

School A
  Stanford Reading
    Less than 19 students          13    4.92   1.12      .00
    More than 18 students          13    4.92   1.12
  Stanford Math
    Less than 19 students          13    5.08   1.89    -1.87
    More than 18 students          13    3.69   1.89

School B
  Stanford Reading
    Less than 19 students          15    3.87   1.85     -.21
    More than 18 students          15    3.73   1.67
  Stanford Math
    Less than 19 students          15    4.33   1.54    -1.78
    More than 18 students          15    3.33   1.54

School C
  Stanford Reading
    Less than 19 students          13    3.73   1.67     -.21
    More than 18 students          13    3.87   1.85
  Stanford Math
    Less than 19 students          13    3.33   1.54    -1.78
    More than 18 students          13    4.33   1.54

School D
  Stanford Reading
    Less than 19 students          14    2.86    .86     -.85
    More than 18 students          14    3.21   1.31
  Stanford Math
    Less than 19 students          14    2.57   1.16     -.69
    More than 18 students          14    2.93   1.54

p < .05

Table 2 shows the results of the “comparability” check on the fundamental variables of this
analysis, that is, reading and mathematics. The Stanford Reading and Mathematics Diagnostic
Tests were used as pre-test measures (i.e., taken in the Fall of 1999) and post-test measures (i.e.,
taken in the Spring of 2000). The standardized tests were given at the beginning and at the end of
the school year following the system-wide assessment calendar. As expected because of the
matched-pair procedure, Table 2 shows that no statistically significant
difference was found between the treatment and comparison groups in either reading or
mathematics when independent-samples t-tests were conducted.



Table 2

Students Matching Procedure on Reading and Mathematics Pre-Test (N = 102)

Groups                    Mean     SD    t-value

Reading
  Treatment group          2.8    1.5      -.23
  Comparison group         2.8     .9

Mathematics
  Treatment group          3.1    1.5      -.08
  Comparison group         3.1    1.4

Note: Treatment group n = 47; Comparison group n = 55

As previously mentioned, a pair-matching procedure was again utilized to assure
similarity between the groups of students. Table 3 shows the demographic and social
characteristics of both groups of students. Socio-economic status, operationalized as participation
in the national lunch program, is a critical variable in this kind of study. The comparability is
evident in socio-economic status and age expressed in years. The comparison group had more
Black and female students than the treatment group. The evaluator decided to keep this group as
it was to ensure enough statistical power to avoid a Type II error, that is, not finding a statistically
significant difference when one exists (Stevens, 1996).



Table 3

Profile of Participating Students (N = 102)

Group         N    Age in Years    Race         Gender        Lunch Type

Comparison    57   9.36            54% Black    60% Female    79% Free
                                   42% White    40% Male      14% Pay
                                    4% Other                   7% Reduced

Treatment     47   9.36            45% Black    45% Female    75% Free
                                   51% White    55% Male      17% Pay
                                    4% Other                   9% Reduced
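
The power concern raised before Table 3 (keeping the full comparison group to limit the risk of a Type II error) can be made concrete with a rough calculation. The sketch below is not part of the original analysis. It uses a normal approximation rather than the exact noncentral t distribution, takes the group sizes from the note to Table 2 (n = 47 and n = 55), and assumes illustrative standardized effect sizes (Cohen's d of 0.2, 0.5, and 0.8). The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n1, n2, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample t-test.

    d is an assumed standardized mean difference (Cohen's d).
    The small chance of rejecting in the wrong tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    return z.cdf(noncentrality - z_crit)


# Group sizes from the note to Table 2; the effect sizes are illustrative
# assumptions, not estimates from this study.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: approximate power = {approx_power(d, 47, 55):.2f}")
```

Under these assumptions, groups of this size would be likely to detect a large effect but would often miss a small one, which is worth keeping in mind when interpreting a null result.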
An additional step was taken to consider one of the typical confounding variables identified in the
review of the literature. Table 4 shows the profile of the teachers in each of the groups.
Differences were observed between the groups in years of teaching experience. CSR teachers are
considerably less experienced than teachers in the regular classrooms (a mean of 6.6 versus 10.6
years). The range was also very dissimilar between the two groups of teachers. As can be observed,
while the CSR teachers had from zero to
15 years of experience, the non-CSR teachers had from zero to 27 years of teaching experience.
This is an important element that needs to be considered when interpreting this analysis and
determining the implications for policy, practice, and further research.

Table 4

Teachers' Years of Teaching Experience in Comparison and Treatment Groups

Group          Mean     SD     Range

Comparison     10.6    8.9     0-27
Treatment       6.6    4.9     0-15

Instrumentation

In general, quantitative measures were based on the already established data collection
mechanisms of the county under examination. Data came from the Management Information
System (MIS) of the county. The evaluator then placed the information into the Statistical
Package for the Social Sciences (SPSS) through the creation of a data file.

The central measures were related to student achievement, since they served as the
outcome criteria for establishing the success of the program. Currently, the Clay Observation
Survey is used to measure student achievement for Kindergarten and first graders; however, this
instrument has not been fully implemented in the district under examination. For third grade
students, the Stanford Diagnostic Test (Reading and Mathematics) is used in the District under
examination, and the Stanford test given at the end of the school year was
used to measure student learning. The summer effect on student learning is a typical confounding
variable. The summer effect was controlled in this study by using post-test measures given at the
end of the school year to both the comparison and the treatment group.

Data Analysis & Procedures 

As mentioned previously, for the quantitative dimension of this evaluation study,
descriptive and quasi-experimental designs were used (Gall, Borg, & Gall, 1996). First,
descriptive statistics were computed. Second, quasi-experimental designs and analyses were
utilized for assessing tentative cause-effect relationships. Stevens (1996) recommends that
an independent-samples t-test be used when two groups of subjects are being compared on a
dependent variable. All data were entered and analyzed using the Statistical Package for the
Social Sciences (SPSS), version 10.0.
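
For reference, the independent-samples t statistic has the standard form shown below. This is a sketch assuming the equal-variance (pooled) version of the test; the source does not state which variant of the SPSS output was reported. The symbols are notation introduced here: group means, standard deviations s_1 and s_2, group sizes n_1 and n_2, and the pooled standard deviation s_p.

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
df = n_1 + n_2 - 2
```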
Qualitative Study Findings

More than a class-size reduction per se, the CSR program in JCPS has been used to
decrease the pupil-teacher ratio in the sample of participating schools. The typical
implementation strategies found in the sample of JCPS elementary schools were (a)
self-contained classrooms with fewer students at the early grades and (b)
collaborative models with small group activities both within the classroom and outside the
classroom.

The self-contained classroom strategy consisted of assigning the newly hired
teacher to a particular primary level or grade. The newly hired teacher was not the only one
to experience having fewer students in the classroom: senior teachers at the school
also saw a decrease in the number of students they work with. In
this regard, it became relevant to listen to their impressions of CSR, too.

Concerning the collaborative model, an example from one of the schools visited
is illustrative. At this particular school, the cooperative model was a learning experience for both
teachers and students. The CSR teacher is a highly skilled teacher who became itinerant, serving as
a role model for other groups of teachers while helping children in a particular subject area. In
this sense, the model
becomes a kind of embedded professional development for new teachers and also exposes
the children to an outstanding teacher. The implementation evaluation permitted the evaluator to
observe and interview principals showing what can be characterized as administrative “wisdom.”
Administration is a science and an art. It is the art of making decisions under “the law of the
situation” (Shafritz & Ott, 1996). For example, principals adjusted the program to meet their
particular needs. Furthermore, in some schools, principals utilized the CSR teachers to support
the Consolidated Plan focus (e.g., reading, writing, etc.), especially for students performing below
average. In addition, in some schools, CSR teachers were appointed through interesting
empowering exercises within their site-based decision making councils.

The teachers involved in the CSR program have diverse backgrounds. In many schools,
CSR teachers are transfers who bring experience to their particular schools. In other cases, CSR
teachers are entirely new to the District. For experienced teachers, the CSR program
has provided the opportunity to revisit instructional methodologies and techniques that
permit higher student engagement in learning. For new teachers, the CSR program has
eased their entrance into the teaching profession. For example, a new teacher said:
“working with fewer kids gives me the opportunity of applying my teacher preparation programs
in a better environment.”

Principals' and Teachers' Fundamental Perceptions about the CSR Program

The overall impression after visiting, interviewing, and observing a sample of eight
participating schools is that principals and teachers are very enthusiastic about having moved
from 24 to 18 students (or even fewer in some cases). Teachers stated the importance of the
program: “this program is so important for us.” In addition, according to several teachers
interviewed, “having fewer kids makes a huge difference” in the classroom. Three basic themes
were found during the process of this research: (1) CSR impact on students, (2) CSR impact on
teachers, and (3) CSR impact on parental involvement.

CSR Impact on Students

Principals and teachers reported benefits to students in both
cognitive and non-cognitive dimensions. In terms of cognitive benefits, teachers have more
instructional and contact time. In addition, teachers can give better attention to individual
needs, especially to those of students facing barriers to learning. These factors are helping students’
cognitive development. There is a general climate of high expectations on student achievement
(e.g., CTBS scores on state assessment).

In terms of non-cognitive benefits, teachers are experiencing higher levels of
attendance, fewer disciplinary problems, and less time spent on classroom management activities.
In addition, there are higher levels of “student-teacher connection,” which makes it easier to develop
better understanding and communication with each student. According to the teachers
interviewed: “having few kids makes a big difference in terms of behavior.” In general, both
principals and teachers agree that the benefits for students are mostly long-term. In fact, the
impact on achievement is expected only when students reach the Intermediate Grades, that is, in
about three or more years. According to the teachers and principals interviewed, this does
not rule out the possibility of finding short-term results after one year of CSR program
implementation. In general, schools look forward to the program's renewal because it has
supported their priorities, namely, learning and student achievement.

CSR Impact on Teachers 

Principals and teachers cited many benefits for themselves. Principals stated that
certified personnel morale has been higher since teachers have had fewer students in their
classrooms. In this regard, principals argued that the CSR program provides them with an
opportunity to keep “pressure for improvement” at the teacher level. Teachers have felt the
impact of CSR at two different
levels: (a) working conditions and (b) instructional methodologies
and techniques. In terms of working conditions, teachers experience “higher levels of satisfaction
and morale,” and “lower levels of stress.” Teachers are “enjoying being in the teaching-learning
profession,” and “the pressure for accountability is better handled.” Teachers feel more
responsible in classrooms with fewer students. In particular, new teachers experience a better
entrance to their career. Teachers said: “what is good for the mother (teacher) is doing good for
the child (student).” Teachers have implemented some changes in their instructional
methodologies and techniques. According to teachers, teaching and learning are positively
affected by having fewer students in the classroom. One commonly cited instructional change is
the student-centered approach, which teachers perceive as more attainable in
a class with fewer students. The student-centered approach promotes centering all teaching
activities on students' learning needs. Other techniques are individualized instruction, small group
activities, manipulative learning, experiential learning, hands-on learning, and better
implementation and use of diagnostic tools.

Some of the teachers who were interviewed asserted that “everybody can understand the
big difference, except if either you have never been in a classroom or you have been away of the
classroom for too long.” According to principals and teachers, an adjustment in instructional
methodologies and techniques is occurring in various schools. CSR is motivating teachers to
explore new avenues for teaching students. Even educators who have been in the field for many
years are now exploring and experimenting with new ways of “doing things around here.” This trend is
expected to have a positive impact on student achievement. For example, a teacher said “the
program enables to really think about implementing best practices.” Another teacher argued that
“now, there is more time to learn and use more diagnostic instruments.” Cited diagnostic tools
included the Clay Observation Survey, Silvaroli Classroom Reading Inventory, Writing
Diagnostics, and Stanford Diagnostic Tests.

Nevertheless, it must be mentioned that some teachers expressed some level of
skepticism. For instance, a teacher said: “Not all teachers are taking advantage of the classroom
with fewer students. Some teachers continue to teach in the same old style.”
CSR Impact on Parental Involvement

The majority of teachers expressed that the levels of communication and interaction
between parents and teachers have notably increased since the implementation of the CSR
program. Personal relationships between parents and teachers are starting to develop. As a result,
teachers have more knowledge of family issues that are affecting students’ learning process.
This is especially helpful with students facing more barriers to learning. According to a
teacher, “creating a parent-teacher relationship takes time. The lower the number of students, the
more chances to develop an in-depth cooperative relationship that will promote learning.” Many
of the teachers found the CSR program critical for early identification of family issues
affecting student learning. For example, a teacher said: “if you have an empty stomach and a
difficult environment at home, it is just more difficult to concentrate and learn.” However,
according to one teacher, “there is no significant difference in teacher-parent relationships after the
program was implemented at the beginning of the year.”

Quantitative Study Findings 

The evaluator conducted independent-samples t-tests to compare the aggregated matched
groups on student learning as measured by the standardized test.
No statistically significant differences were found in reading or mathematics between the
comparison and treatment groups participating in this matching-based study on CSR. Table 5
displays the results of this analysis.
Table 5

Independent-Samples T-Test (N = 102)

Group                   Post-Test Mean     SD     t-value

Reading
  Comparison group           3.1          1.33      .35
  Treatment group            3.0          1.55

Mathematics
  Comparison group           3.2          1.65     1.72
  Treatment group            2.6          1.70
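
The t-values in Table 5 can be roughly cross-checked from the reported summary statistics. The sketch below is an illustration and not the SPSS procedure used in the study. It assumes the equal-variance (pooled) form of the test and pairs the Table 5 means and standard deviations with the group sizes given in the note to Table 2 (comparison n = 55, treatment n = 47). The function name `pooled_t` is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance independent-samples t statistic from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error


# Post-test means and SDs from Table 5; group sizes from the note to Table 2.
# Pairing these two tables is an assumption made for this sketch.
N_COMPARISON, N_TREATMENT = 55, 47
t_reading = pooled_t(3.1, 1.33, N_COMPARISON, 3.0, 1.55, N_TREATMENT)
t_math = pooled_t(3.2, 1.65, N_COMPARISON, 2.6, 1.70, N_TREATMENT)

# Two-tailed critical value at alpha = .05 for df = 100 (standard t table).
T_CRITICAL = 1.984

for label, t in (("Reading", t_reading), ("Mathematics", t_math)):
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{label}: t = {t:.2f} ({verdict} at the .05 level)")
```

Under these assumptions the reading value agrees with the reported .35. The mathematics value comes out near 1.8 rather than 1.72, a gap small enough to be explained by the means being rounded to one decimal place. Both values fall below the critical value, consistent with the reported lack of statistical significance.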





Discussion 

From a qualitative research perspective, the early findings on the CSR program suggest
that teacher job satisfaction and morale are reaching higher levels. In addition, qualitative
evidence suggests that teachers are spending less time instructing whole classes and that the
program is having an impact on the use of new teaching strategies. Also, teachers are
spending less time on discipline and behavior-related issues. With respect to parent-teacher
relations, the evidence suggests that there are higher levels of contact and parental involvement.

Further research is needed to improve our understanding of the impact of the CSR
program in terms of actual translation into student achievement gains, especially for disadvantaged
students. Since job satisfaction is considered a fundamental predictor of job performance, it is
reasonable to expect that teachers’ higher level of satisfaction might translate into higher levels of
student achievement.

The study showed the existence of some challenges for the implementation of the CSR
program. Some principals are experiencing problems related to the number of students
enrolled and space limitations. In some cases, the school did not have an available room for a new
teacher. Therefore, the school implemented collaborative models with itinerant teachers. Another
limitation was related to knowledge of and training in effective techniques for foundational subjects
that take advantage of research-based best practices. In this regard, the treatment (i.e., class size
reduction) needs to be fleshed out with instructional approaches that optimize teacher effectiveness
in a class with fewer students.

Principals offered several suggestions to make this program even more successful: (a)
fine-tune the recruitment and selection process of teachers hired to participate in the program; (b)
provide an orientation program to new hires on District Vision and Mission, school’s plans, and
effective techniques in foundational subjects; and (c) develop on-the-job training or conferences
for all teachers on proper implementation and use of diagnostic and assessment tools with
students, and on successful teaching techniques for small groups. Finally, teachers articulated the
need for content-specific training and development activities. For example, for those
teachers working in self-contained classrooms, it was recommended to develop
on-the-job training activities providing new methodological “tools” to take advantage of having fewer
students. Another example concerns teachers working in a collaborative model. Teachers
working as partners of other teachers expressed the need for on-the-job training or
conferences on how to work cooperatively to avoid “territorial” approaches to teaching.
From a quantitative research perspective, the one-year intervention did
not produce immediate results in student learning. No statistically significant difference was
found between the matched comparison and treatment groups. Both the comparison and
treatment groups received a pre-test at the beginning of the school year and a post-test at the end
of the school year. Only the schools that administered the standardized test were part of this
investigation, which controlled for the summer effect on student learning. In addition,
having the less experienced educators in charge of the reduced classes might have had an impact
in the District under examination.

Initial orientation and training for CSR teachers is a fundamental recommendation from
this research. Only well-trained teachers are likely to have the knowledge and skills to take
advantage of the lower class size educational intervention at the primary level. Efforts to increase
the quality of teachers involved in the CSR program, as in any educational program, are
important in the long run, and significant score gains might be obtained with a better prepared
teaching force (Darling-Hammond, 1997).

The results of this research can be discussed in light of the findings of a recently 
published RAND study (Grissmer, Flanagan, Kawata, & Williamson, 2000). According to 
Grissmer et al. (2000, p. xxxii), "the Tennessee results suggest that two students can have similar 
pretest scores and similar schooling conditions during a grade and still emerge with different 
posttest grades that have been influenced by different earlier schooling conditions." From the 
standpoint of child development, these results are consistent with the concepts of risk and 
resiliency in children. In this respect, the RAND study argues that four years of small classes 
appear to provide resiliency against later larger class sizes, whereas one or two years does not. 
The limitations of the mixed-method evaluation were multiple. First, the
participative-oriented evaluation approach used in this study was limited in at least two ways. (a) The nature
of case study research is to gain an in-depth, contextual understanding of one or more cases (Yin,
1994). Thus, direct generalizability to other school districts’ CSR programs is not advisable.
(b) This study did not include qualitative student data. The students' views were not included
given that the purpose of this study was to understand the perspectives of stakeholders who were
involved in the implementation and administration of the program over time. Second, the
management-oriented evaluation approach complemented the qualitative study by
analyzing testing data using quantitative tools. The CSR program quantitative evaluation design
was not a true experimental design. In this regard, multiple threats to internal validity
should be noted. Internal validity is related to measuring the net effect of the treatment.
In this sense, the CSR program evaluation faces the problem of establishing causality while
controlling for extraneous or confounding variables. Sample size, external validity, and
generalizability were also issues in this research, since only the lowest performing schools
participate in the CSR program. The sample can be considered small because
not all schools in the District take the post-test assessment at the end of the school year. It is
difficult to establish whether the treatment effect can be generalized to different participants,
settings, or times.

In general, the CSR program design is weaker in internal validity but stronger in external
validity. Major threats to internal validity affecting the CSR program included history,
maturation, and regression toward the mean. For example, "history" (i.e., events affecting
participants in addition to the treatment) was present, since many of the participating schools
have several programs already in place and the program spanned the entire academic year.
"Maturation" (i.e., participants change over time) and statistical regression (i.e., on average,
participants at the extremes at pre-testing move closer to the mean when post-tested, irrespective of
the treatment) were also threats to internal validity in this research study. Further
research needs to overcome the aforementioned threats to internal validity and assess the
longitudinal impact of the CSR program in large urban districts.
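
Regression toward the mean can be illustrated with simulated scores. The sketch below is purely illustrative and uses no data from this study: the sample size, score scale, noise level, and 10% cutoff are all assumptions. Each simulated student has a stable underlying level plus independent test-day noise on the pre-test and the post-test, and no treatment effect is included.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated scores only: observed score = stable level + test-day noise.
# No treatment effect is built in.
N_STUDENTS = 10_000
true_level = [random.gauss(3.0, 1.0) for _ in range(N_STUDENTS)]
pre = [level + random.gauss(0.0, 1.0) for level in true_level]
post = [level + random.gauss(0.0, 1.0) for level in true_level]

# Select the students whose pre-test scores fall in the lowest 10%.
cutoff = sorted(pre)[N_STUDENTS // 10]
low = [i for i in range(N_STUDENTS) if pre[i] <= cutoff]

low_pre_mean = mean(pre[i] for i in low)
low_post_mean = mean(post[i] for i in low)
overall_pre_mean = mean(pre)

print(f"Lowest decile, pre-test mean:  {low_pre_mean:.2f}")
print(f"Same students, post-test mean: {low_post_mean:.2f}")
print(f"Overall pre-test mean:         {overall_pre_mean:.2f}")
```

Because the lowest-scoring students on the pre-test were selected partly for bad luck on that day, their post-test mean moves back toward the overall mean even though nothing was done to them. This is why a gain observed among low scorers cannot be attributed to a program without a comparable control group.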

In conclusion, it needs to be mentioned that the CSR program is in its initial stages of
implementation. So far, after one year of implementation, CSR has not shown a statistically
significant impact when compared to regular classrooms. However, many program effects in
educational settings only emerge after several years of implementation. Another study might be
needed to address the issue of longitudinal influences and/or to use different measurement
instruments to assess the impact of the program.